Draft: jigsaw codecs: allow codecs to specify the buffers they work on#3529
Open
keewis wants to merge 9 commits intozarr-developers:mainfrom
Open
Draft: jigsaw codecs: allow codecs to specify the buffers they work on#3529keewis wants to merge 9 commits intozarr-developers:mainfrom
keewis wants to merge 9 commits intozarr-developers:mainfrom
Conversation
Contributor
Author
|
it just dawned to me that we can potentially split up the sparse codec (which is a array-to-bytes codec) into a array-to-array codec that extracts the metadata and component arrays of the sparse array and creates to specialized "multi-array buffer" for sparse arrays, and a generalized array-to-bytes codec that takes the "multi-array buffer" and packs it into bytes. This obviously means that the metadata we extracted has to live in the array-to-array codec's configuration. Then should we want a similar procedure for a different array type (e.g. masked arrays or geoarrow-encoded geometry arrays), we can just create a specialized pair of array-to-array codec and "multi-array buffer" type, and reuse the "multi-array to bytes" codec. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
For my work on the sparse codec (and after discussing with @d-v-b and @jhamman at the zarr summit), I've noticed that it should be possible to have the codecs declare their input and output buffer types. The codec pipeline can then verify that the codecs form a chain of buffer types (kind of like jigsaw puzzle pieces), and infer the codec pipeline's buffer prototype as the input of the first array-to-array codec and the output of the last bytes-to-bytes codec.
TODO:
docs/user-guide/*.mdchanges/